ATOM Documentation

← Back to App

Feature Parity Summary: LLM Service & Pricing Architecture

**Date**: 2026-03-31

**Status**: ✅ Complete

Architecture Overview

Both repositories now have identical multi-layer LLM service architecture:

┌─────────────────────────────────────────────────────────────┐
│                    Application Layer                         │
├─────────────────────────────────────────────────────────────┤
│  LLMService (core/llm_service.py)                           │
│  - Unified interface for all LLM interactions               │
│  - Wraps BYOKHandler with additional abstractions           │
│  - Provider/model enums, structured outputs                 │
└─────────────────────────────────────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────────┐
│                   BYOK Handler Layer                         │
├─────────────────────────────────────────────────────────────┤
│  BYOKHandler (core/llm/byok_handler.py)                     │
│  - Multi-provider routing & fallback                        │
│  - Cost tracking & optimization                             │
│  - Rate limiting & circuit breakers                         │
│  - Cognitive tier integration                               │
└─────────────────────────────────────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────────┐
│                  Dynamic Pricing Layer                       │
├─────────────────────────────────────────────────────────────┤
│  DynamicPricingFetcher (core/dynamic_pricing_fetcher.py)    │
│  - Fetches live pricing from LiteLLM GitHub                 │
│  - Fallback to OpenRouter API                               │
│  - 24-hour local cache with auto-refresh                    │
│  - 2000+ AI model prices                                    │
└─────────────────────────────────────────────────────────────┘

API Endpoints (Feature Parity Achieved)

1. BYOK Routes (`/api/byok/*`)

**Location**: api/byok_routes.py

EndpointMethodDescription
/api/byok/providersGETList all AI providers with status
/api/byok/keysGET/POSTManage API keys
/api/byok/usageGETGet usage statistics
/api/ai/pricingGETGet current model pricing from cache
/api/ai/pricing/refreshPOSTForce refresh pricing from external APIs
/api/ai/pricing/model/{model}GETGet pricing for specific model
/api/ai/pricing/provider/{provider}GETGet all models for provider
/api/ai/pricing/estimatePOSTEstimate cost for a request

2. LLM Registry Routes (`/api/llm-registry/*`)

**Location**: api/llm_registry_routes.py

EndpointMethodDescription
/api/llm-registry/provider-healthGETHealth status for providers
/api/llm-registry/models/by-qualityGETFilter models by quality score
/api/llm-registry/models/searchGETSearch models by name/capability
/api/llm-registry/providers/listGETList all providers
/api/llm-registry/sync-qualityPOSTSync quality scores from LMSYS

3. Cognitive Tier Routes (`/api/cognitive-tiers/*`)

**Location**: api/cognitive_tier_routes.py

EndpointMethodDescription
/api/cognitive-tiers/preferencesGET/POST/PUTManage tier preferences
/api/cognitive-tiers/estimate-costPOSTEstimate cost per tier
/api/cognitive-tiers/compareGETCompare tiers (quality vs cost)

External Pricing APIs

The system fetches real-time pricing from:

  1. **LiteLLM Model Prices** (Primary)
  • URL: https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json
  • 2000+ models with input/output costs
  • Updated regularly by LiteLLM maintainers
  1. **OpenRouter API** (Fallback)
  • URL: https://openrouter.ai/api/v1/models
  • Additional models not in LiteLLM
  • Real-time pricing data

Files Copied to Open-Source

Core Modules

  • core/llm_service.py (already existed, verified parity)
  • core/llm/byok_handler.py (already existed)
  • core/dynamic_pricing_fetcher.py (already existed)
  • core/llm/registry/ (entire directory - 16 files)
  • core/cache.py (UniversalCacheService)
  • core/schemas.py (ApiResponse and other schemas)
  • core/validation.py (validation utilities)
  • core/config.py (added settings alias)

API Routes

  • api/byok_routes.py (1359 lines - includes pricing endpoints)
  • api/llm_registry_routes.py (267 lines - new file)
  • api/cognitive_tier_routes.py (already existed)

Main App Registration

  • main_api_app.py - BYOK routes registered

Provider Support

Both repos now support 11 providers:

  • OpenAI, Anthropic, Google (Gemini), Meta (Llama)
  • Mistral, Cohere, DeepSeek, MiniMax
  • Qwen, Zhipu (GLM), Groq

Testing

Import Tests ✅

✓ LLMService imports correctly
✓ BYOKHandler imports correctly
✓ DynamicPricingFetcher imports correctly
✓ LLMRegistryService imports correctly
✓ BYOK routes imports correctly
✓ LLM Registry routes imports correctly

Component Tests ✅

TEST 1: DynamicPricingFetcher
✓ Fetcher initialized
  - Cache ready (needs refresh for latest prices)

TEST 2: LLMService
✓ LLMService initialized
  - Workspace ID: default
  - Tenant ID: default
  - Handler type: BYOKHandler
  - Provider detection working (gpt-4o→openai, claude→anthropic, etc.)

TEST 3: BYOKHandler
✓ BYOKHandler initialized
  - Providers configured

TEST 4: LLM Registry Service
✓ LLMRegistryService imports OK
✓ ProviderHealthService imports OK
  - Provider health check: 2 providers checked
    - openai: healthy
    - anthropic: healthy

API Endpoint Tests ✅

1. GET /api/ai/pricing
   Status: 200 ✓
   Success: True
   Returns: model_count, last_updated, cache_valid, cheapest_models

2. GET /api/llm-registry/provider-health
   Status: 200 ✓
   Providers checked: 7
     - openai: healthy
     - anthropic: healthy
     - google: healthy

3. GET /api/llm-registry/providers/list
   (Requires DB setup - model relationship issue unrelated to new code)

Pricing Cache Status

The pricing cache is empty initially. To populate it:

curl -X POST http://localhost:8000/api/ai/pricing/refresh?force=true

This will fetch 2000+ model prices from LiteLLM GitHub and cache them.

Key Differences from Initial Understanding

**Initial Question**: "There's an API endpoint to get latest pricing, is that the registry?"

**Answer**: No - they are **separate but complementary** systems:

  1. **LLM Registry** = Model metadata, quality scores, health monitoring
  2. **BYOK Pricing Endpoints** = Live cost data from external APIs
  3. **LLMService** = Application-layer wrapper around BYOKHandler

The pricing endpoints (/api/ai/pricing/*) are part of BYOK routes, NOT the LLM Registry.

Next Steps (Optional Enhancements)

  • [ ] Add automated pricing sync job (hourly/daily)
  • [ ] Implement cost alerting when budgets exceeded
  • [ ] Add provider performance tracking (latency, success rate)
  • [ ] Create admin dashboard for pricing monitoring

Conclusion

✅ **Feature parity achieved** between SaaS and Open-Source repositories for:

  • LLM Service abstraction layer
  • BYOK handler with multi-provider support
  • Dynamic pricing from external APIs
  • LLM Registry with quality scores
  • Cognitive tier management
  • All related API endpoints

Both codebases now have identical capabilities for cost-aware AI routing, provider health monitoring, and model quality tracking.